1. Sample overview


2. Principal Component Analysis

Principal Component Analysis (PCA) is a method used in unsupervised analysis to reduce the dimension of large data sets and is a useful tool to explore sample clusters arising based on the expression profile. The data points, that represent the samples, are projected onto the 2D plane such that they spread out in the two directions that explain most of the differences. Before plotting variance stabilizing transformations were performed. The following figures show the PCA obtained using the variance stabilizing transformation value of all genes (Figure 1). Under ideal conditions, the samples between groups should be dispersed and the samples within groups should be gathered together.

Figure 1: Principal Component Analysis (PCA) plot for all samples. The PCA was performed on all samples using the DESeq2 transformed counts (using variance stabilizing transformation) of all genes. Dots represent individual samples (identification of subject is shown with results when cursor is at the dot).

Figure 1A: Principal Component Analysis (PCA) plot for all samples. The PCA was performed on all samples using the DESeq2 transformed counts (using variance stabilizing transformation) of all genes. Dots represent individual samples (identification of subject is shown with results when cursor is at the dot).


Principal components (PC1, PC2, etc.) significant associations with Sample Covariates are listed in Table 2. They will be used, as a design model for analysis, since of theirs significant association with PC.

Table 2: Correlations between Principal Components (PCs) and covariates.
Principal component Covariate p-value
PC2 Bio_rep 0.0058203
PC2 passage 0.0455630
PC2 Pair 0.0058203

3. Differential Gene Expression Analysis

The differential gene expression analysis was performed with DESeq2 workflow (Love et al.,2014).

The un-normalized, “raw” counts in a form of a matrix of integer values received form sequencing facility were used.

The first step was filtering of lowly expressed genes. As a threshold value Number of samples * 5 was used. The UMI read counts across all samples that were less than or equal to threshold value were filtered out and genes that pass this filter proceed further with the differential expression analysis.

Group comparisons

The listed in Table 3 sample groups comparisons were performed.

Table 3: Group Comparisons : Sample Group vs Control Group.
No Sample Group Control Group Group Comparisons
1 50uM ZED1227 mock 50uM ZED1227 VS mock
2 100 U/mL hIFNg mock 100 U/mL hIFNg VS mock
3 100 U/mL hIFNg + 50uM ZED1227 mock 100 U/mL hIFNg + 50uM ZED1227 VS mock
4 100 U/mL hIFNg + 50uM ZED1227 100 U/mL hIFNg 100 U/mL hIFNg + 50uM ZED1227 VS 100 U/mL hIFNg

Differential Gene Expression Results

The results of differential regulation analysis summary are shown in Table 4. Number of up and down-regulated genes for each comparison above is defined as the Benjamini-Hochberg adjusted P-values (padj) ≤ 0.05 and log2 fold change threshold (log2FC) of (≥ |0.5|) (1.5-fold up/down-regulated).

Table 4. Summary of the results of differential regulation analysis.
No Group Comparisons # DEG # downregulated # upregulated
50uM ZED1227 VS mock 1 50uM ZED1227 VS mock 10 2 8
100 U/mL hIFNg VS mock 2 100 U/mL hIFNg VS mock 881 281 600
100 U/mL hIFNg + 50uM ZED1227 VS mock 3 100 U/mL hIFNg + 50uM ZED1227 VS mock 704 198 506
100 U/mL hIFNg + 50uM ZED1227 VS 100 U/mL hIFNg 4 100 U/mL hIFNg + 50uM ZED1227 VS 100 U/mL hIFNg 11 4 7

Volcano plots

The Volcano plot provides a way to perform a quick visual identification of the gene displaying large-magnitude changes which are also statistically significant. The plot is constructed by plotting -log10(padj) on the y-axis, and the log2 of the expression fold change between the two experimental groups on the x-axis. There are two regions of interest in the plot: those points that are found towards the top of the plot (high statistical significance) and at the extreme left or right (strongly down and up-regulated respectively).

50uM ZED1227 VS mock

100 U/mL hIFNg VS mock

100 U/mL hIFNg + 50uM ZED1227 VS mock

100 U/mL hIFNg + 50uM ZED1227 VS 100 U/mL hIFNg


DEG tables

For each differential gene expression analysis, all significantly deferentially expressed genes are listed. Significant changes are defined as padj ≤ 0.05 and log2FC threshold of (≥ |0.5|).

Full tables are available in Tables folder or using the link: Tables folder .

50uM ZED1227 VS mock

100 U/mL hIFNg VS mock

100 U/mL hIFNg + 50uM ZED1227 VS mock

100 U/mL hIFNg + 50uM ZED1227 VS 100 U/mL hIFNg


Heatmaps

A heatmap is a graphical representation of data that uses a system of color-coding to represent different values. All identified DEGs, organized as a heatmap, presented in Figure 3.

50uM ZED1227 VS mock

Figure 3: Differential gene transcriptome for  50uM ZED1227 VS mock  comparison.

Figure 3: Differential gene transcriptome for 50uM ZED1227 VS mock comparison.

100 U/mL hIFNg VS mock

Figure 3: Differential gene transcriptome for  100 U/mL hIFNg VS mock  comparison.

Figure 3: Differential gene transcriptome for 100 U/mL hIFNg VS mock comparison.

100 U/mL hIFNg + 50uM ZED1227 VS mock

Figure 3: Differential gene transcriptome for  100 U/mL hIFNg + 50uM ZED1227 VS mock  comparison.

Figure 3: Differential gene transcriptome for 100 U/mL hIFNg + 50uM ZED1227 VS mock comparison.

100 U/mL hIFNg + 50uM ZED1227 VS 100 U/mL hIFNg

Figure 3: Differential gene transcriptome for  100 U/mL hIFNg + 50uM ZED1227 VS 100 U/mL hIFNg  comparison.

Figure 3: Differential gene transcriptome for 100 U/mL hIFNg + 50uM ZED1227 VS 100 U/mL hIFNg comparison.


Common significantly differentially expressed genes between group comparisons

To identify common significantly differentially regulated genes that occur between the different the group comparisons, Upset plot (Figure 4A) and Venn diagram (Figure 4B) are used. On Figure 4A the numbers on top of the barplot shows the number of unique genes common to the comparisons specified below using black circles.

UpSet

Figure 4A: Visualization of the intersection of significantly differentially regulated genes from all comparisons using an UpSet plot. The numbers on top of the barplot indicate the number of unique genes common to the comparisons specified below using black circles.

Venn diagram

## [[1]]

## 
## [[2]]

Figure 4B: Venn diagrams of all differentially expressed genes.

Common significant genes list


Pearson’s correlation heatmap

To check similarity between individual samples, the pairwise correlation was performed. With selected DEG genes from table 6, correlation coefficient for those genes for each pair was obtained and organized in a form of heatmap (Figure 5).

*** Figure 5. The pairwise correlation heatmap visualizes cross-correlations between the transcriptomic profiles of the samples.

4. Enrichment Analysis

GO enrichment analysis results

GO is the abbreviation of Gene Ontology, which is a major bioinformatics classification system to unify the presentation of gene properties across all species. It includes three main branches: cellular component, molecular function, and biological process. GO terms with padj < 0.05 are usually considered to be significant enrichment.

Full tables with all detected significant GO terms and annotated genes are available in Tables folder or can be downloaded using the link: Tables folder.

50uM ZED1227 VS mock

100 U/mL hIFNg VS mock

100 U/mL hIFNg + 50uM ZED1227 VS mock

100 U/mL hIFNg + 50uM ZED1227 VS 100 U/mL hIFNg


GO enrichment analysis barplot

50uM ZED1227 VS mock

Figure 6: Gene Ontology terms Barplot of enriched terms in an identified set of genes showing altered expression in the  50uM ZED1227 VS mock  groups. TOP 20 are shown.

Figure 6: Gene Ontology terms Barplot of enriched terms in an identified set of genes showing altered expression in the 50uM ZED1227 VS mock groups. TOP 20 are shown.

100 U/mL hIFNg VS mock

Figure 6: Gene Ontology terms Barplot of enriched terms in an identified set of genes showing altered expression in the  100 U/mL hIFNg VS mock  groups. TOP 20 are shown.

Figure 6: Gene Ontology terms Barplot of enriched terms in an identified set of genes showing altered expression in the 100 U/mL hIFNg VS mock groups. TOP 20 are shown.

100 U/mL hIFNg + 50uM ZED1227 VS mock

Figure 6: Gene Ontology terms Barplot of enriched terms in an identified set of genes showing altered expression in the  100 U/mL hIFNg + 50uM ZED1227 VS mock  groups. TOP 20 are shown.

Figure 6: Gene Ontology terms Barplot of enriched terms in an identified set of genes showing altered expression in the 100 U/mL hIFNg + 50uM ZED1227 VS mock groups. TOP 20 are shown.

100 U/mL hIFNg + 50uM ZED1227 VS 100 U/mL hIFNg

Figure 6: Gene Ontology terms Barplot of enriched terms in an identified set of genes showing altered expression in the  100 U/mL hIFNg + 50uM ZED1227 VS 100 U/mL hIFNg  groups. TOP 20 are shown.

Figure 6: Gene Ontology terms Barplot of enriched terms in an identified set of genes showing altered expression in the 100 U/mL hIFNg + 50uM ZED1227 VS 100 U/mL hIFNg groups. TOP 20 are shown.


Reactome Enrichment Analysis

The Reactome is a database of reactions, pathways, and biological processes, which can be used to browse pathways and submit data to a suite of data analysis tools, containing curated annotations that cover a diverse set of topics in molecular and cellular biology. Reactome terms with padj < 0.05 are considered to be significant enrichment.

Tables with significant Reactome terms and annotated genes are available in Tables folder or can be downloaded using the link: Tables folder

50uM ZED1227 VS mock

100 U/mL hIFNg VS mock

100 U/mL hIFNg + 50uM ZED1227 VS mock

100 U/mL hIFNg + 50uM ZED1227 VS 100 U/mL hIFNg


Reactome enrichment analysis barplot

50uM ZED1227 VS mock

Figure 7: Reactome terms Barplot of enriched terms in an identified set of genes showing altered expression in the  50uM ZED1227 VS mock  groups. TOP 20 are shown.

Figure 7: Reactome terms Barplot of enriched terms in an identified set of genes showing altered expression in the 50uM ZED1227 VS mock groups. TOP 20 are shown.

100 U/mL hIFNg VS mock

Figure 7: Reactome terms Barplot of enriched terms in an identified set of genes showing altered expression in the  100 U/mL hIFNg VS mock  groups. TOP 20 are shown.

Figure 7: Reactome terms Barplot of enriched terms in an identified set of genes showing altered expression in the 100 U/mL hIFNg VS mock groups. TOP 20 are shown.

100 U/mL hIFNg + 50uM ZED1227 VS mock

Figure 7: Reactome terms Barplot of enriched terms in an identified set of genes showing altered expression in the  100 U/mL hIFNg + 50uM ZED1227 VS mock  groups. TOP 20 are shown.

Figure 7: Reactome terms Barplot of enriched terms in an identified set of genes showing altered expression in the 100 U/mL hIFNg + 50uM ZED1227 VS mock groups. TOP 20 are shown.

100 U/mL hIFNg + 50uM ZED1227 VS 100 U/mL hIFNg

Figure 7: Reactome terms Barplot of enriched terms in an identified set of genes showing altered expression in the  100 U/mL hIFNg + 50uM ZED1227 VS 100 U/mL hIFNg  groups. TOP 20 are shown.

Figure 7: Reactome terms Barplot of enriched terms in an identified set of genes showing altered expression in the 100 U/mL hIFNg + 50uM ZED1227 VS 100 U/mL hIFNg groups. TOP 20 are shown.


5. TGM2 mRNA expression and GO terms it is involved

*** Figure 8: Transglutaminase 2 (TGM2) mRNA expression.

Samples GSZs

To assess how reduced by drug TGM2 expression affects the pathways it is involved in we applied Gene Set Z-score (GSZ) calculation approach (Törönen et al., 2009) .

GSZ is defined as the difference between the error-weighted mean of the expression values of the genes in each pathway and the error-weighted mean of all genes in a sample after normalization. The number obtained shows how active a certain biological pathway or gene set in a specified sample.

To select gene sets where TGM2 is participating, from all enriched GO terms identified in step 4, only those who contain TGM2 gene were selected. Table with significant GO terms is available in Tables folder or could be downloaded using the link: TGM2 GO terms. For subsequent analysis child terms, i.e. those ones that are closer to the leaf nodes, were selected.

## Number of unique permutations: 20
## Number of unique permutations: 20Number of unique permutations: 10Number of unique permutations: 10

Figure 9. Sample GSZs for TGM2 involved GO terms.

GSZs group comparisons

To be able to compare group gene sets activity, the gene set analysis method described in Mishra et al., 2014, together with asymptotic P -values calculation was applied to our data set.

*** Figure 10. GSZ group comparisons for TGM2 involved GO terms.

6. Session information.

## R version 4.4.0 (2024-04-24 ucrt)
## Platform: x86_64-w64-mingw32/x64
## Running under: Windows 11 x64 (build 26100)
## 
## Matrix products: default
## 
## 
## locale:
## [1] LC_COLLATE=English_United States.utf8 
## [2] LC_CTYPE=English_United States.utf8   
## [3] LC_MONETARY=English_United States.utf8
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.utf8    
## 
## time zone: Europe/Helsinki
## tzcode source: internal
## 
## attached base packages:
## [1] grid      stats4    stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] zousan_0.0.0.170825         mGSZ_1.0                   
##  [3] limma_3.60.6                ismev_1.42                 
##  [5] mgcv_1.9-1                  nlme_3.1-167               
##  [7] MASS_7.3-65                 GSA_1.03.3                 
##  [9] numform_0.7.0               reshape2_1.4.4             
## [11] ReactomePA_1.48.0           org.Hs.eg.db_3.19.1        
## [13] topGO_2.56.0                SparseM_1.84-2             
## [15] graph_1.82.0                GOxploreR_1.2.7            
## [17] ggraph_2.2.1                gontr_1.1.0                
## [19] GO.db_3.19.1                AnnotationDbi_1.66.0       
## [21] ggplotify_0.1.2             VennDiagram_1.7.3          
## [23] futile.logger_1.4.3         UpSetR_1.4.0               
## [25] ComplexHeatmap_2.20.0       circlize_0.4.16            
## [27] htmltools_0.5.8.1           RCurl_1.98-1.17            
## [29] formattable_0.2.1           ggrepel_0.9.6              
## [31] writexl_1.5.2               rlist_0.4.6.2              
## [33] biomaRt_2.60.1              kableExtra_1.4.0           
## [35] DEGreport_1.40.1            htmlwidgets_1.6.4          
## [37] lubridate_1.9.4             forcats_1.0.0              
## [39] stringr_1.5.1               dplyr_1.1.4                
## [41] purrr_1.0.4                 readr_2.1.5                
## [43] tidyr_1.3.1                 tibble_3.2.1               
## [45] tidyverse_2.0.0             plotly_4.10.4              
## [47] scales_1.3.0                ggpubr_0.6.0               
## [49] ggplot2_3.5.1               DESeq2_1.44.0              
## [51] SummarizedExperiment_1.34.0 Biobase_2.64.0             
## [53] MatrixGenerics_1.16.0       matrixStats_1.5.0          
## [55] GenomicRanges_1.56.2        GenomeInfoDb_1.40.1        
## [57] IRanges_2.38.1              S4Vectors_0.42.1           
## [59] BiocGenerics_0.50.0         DT_0.33                    
## [61] readxl_1.4.5               
## 
## loaded via a namespace (and not attached):
##   [1] fs_1.6.5                    bitops_1.0-9               
##   [3] enrichplot_1.24.4           httr_1.4.7                 
##   [5] RColorBrewer_1.1-3          doParallel_1.0.17          
##   [7] tools_4.4.0                 backports_1.5.0            
##   [9] R6_2.6.1                    lazyeval_0.2.2             
##  [11] GetoptLong_1.0.5            withr_3.0.2                
##  [13] graphite_1.50.0             prettyunits_1.2.0          
##  [15] gridExtra_2.3               cli_3.6.4                  
##  [17] formatR_1.14                logging_0.10-108           
##  [19] Cairo_1.6-2                 scatterpie_0.2.4           
##  [21] network_1.19.0              labeling_0.4.3             
##  [23] sass_0.4.9                  systemfonts_1.2.1          
##  [25] yulab.utils_0.2.0           gson_0.1.0                 
##  [27] DOSE_3.30.5                 svglite_2.1.3              
##  [29] R.utils_2.13.0              rstudioapi_0.17.1          
##  [31] RSQLite_2.3.9               generics_0.1.3             
##  [33] gridGraphics_0.5-1          shape_1.4.6.1              
##  [35] crosstalk_1.2.1             car_3.1-3                  
##  [37] Matrix_1.7-3                abind_1.4-8                
##  [39] R.methodsS3_1.8.2           lifecycle_1.0.4            
##  [41] yaml_2.3.10                 edgeR_4.2.2                
##  [43] carData_3.0-5               qvalue_2.36.0              
##  [45] SparseArray_1.4.8           BiocFileCache_2.12.0       
##  [47] blob_1.2.4                  crayon_1.5.3               
##  [49] lattice_0.22-6              cowplot_1.1.3              
##  [51] annotate_1.82.0             KEGGREST_1.44.1            
##  [53] pillar_1.10.1               knitr_1.50                 
##  [55] fgsea_1.30.0                rjson_0.2.23               
##  [57] codetools_0.2-20            fastmatch_1.1-6            
##  [59] glue_1.8.0                  ggfun_0.1.8                
##  [61] data.table_1.17.0           treeio_1.28.0              
##  [63] vctrs_0.6.5                 png_0.1-8                  
##  [65] cellranger_1.1.0            gtable_0.3.6               
##  [67] cachem_1.1.0                xfun_0.51                  
##  [69] S4Arrays_1.4.1              tidygraph_1.3.1            
##  [71] ConsensusClusterPlus_1.68.0 coda_0.19-4.1              
##  [73] iterators_1.0.14            statmod_1.5.0              
##  [75] ggtree_3.12.0               bit64_4.6.0-1              
##  [77] progress_1.2.3              filelock_1.0.3             
##  [79] bslib_0.9.0                 colorspace_2.1-1           
##  [81] DBI_1.2.3                   mnormt_2.1.1               
##  [83] tidyselect_1.2.1            bit_4.6.0                  
##  [85] compiler_4.4.0              curl_6.2.1                 
##  [87] httr2_1.1.1                 xml2_1.3.8                 
##  [89] ggdendro_0.2.0              DelayedArray_0.30.1        
##  [91] shadowtext_0.1.4            psych_2.5.3                
##  [93] rappdirs_0.3.3              digest_0.6.37              
##  [95] rmarkdown_2.29              XVector_0.44.0             
##  [97] pkgconfig_2.0.3             dbplyr_2.5.0               
##  [99] fastmap_1.2.0               rlang_1.1.5                
## [101] GlobalOptions_0.1.2         UCSC.utils_1.0.0           
## [103] farver_2.1.2                jquerylib_0.1.4            
## [105] jsonlite_1.9.1              BiocParallel_1.38.0        
## [107] statnet.common_4.11.0       GOSemSim_2.30.2            
## [109] R.oo_1.27.0                 magrittr_2.0.3             
## [111] Formula_1.2-5               GenomeInfoDbData_1.2.12    
## [113] patchwork_1.3.0             munsell_0.5.1              
## [115] Rcpp_1.0.14                 ape_5.8-1                  
## [117] viridis_0.6.5               stringi_1.8.4              
## [119] zlibbioc_1.50.0             plyr_1.8.9                 
## [121] parallel_4.4.0              Biostrings_2.72.1          
## [123] graphlayouts_1.2.2          splines_4.4.0              
## [125] hms_1.1.3                   locfit_1.5-9.12            
## [127] igraph_2.1.4                ggsignif_0.6.4             
## [129] futile.options_1.0.1        XML_3.99-0.18              
## [131] evaluate_1.0.3              lambda.r_1.2.4             
## [133] tzdb_0.5.0                  foreach_1.5.2              
## [135] tweenr_2.0.3                polyclip_1.10-7            
## [137] reshape_0.8.9               clue_0.3-66                
## [139] ggforce_0.4.2               broom_1.0.7                
## [141] xtable_1.8-4                reactome.db_1.88.0         
## [143] tidytree_0.4.6              rstatix_0.7.2              
## [145] viridisLite_0.4.2           aplot_0.2.5                
## [147] memoise_2.0.1               cluster_2.1.8.1            
## [149] timechange_0.3.0